Results 1 - 6 of 6
1.
J Comput Aided Mol Des ; 37(1): 17-37, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36404382

ABSTRACT

One solution to the challenge of choosing an appropriate clustering algorithm is to combine different clusterings into a single consensus clustering result, known as a cluster ensemble (CE). This ensemble learning strategy can provide more robust and stable solutions across different domains and datasets. Unfortunately, not all clusterings in the ensemble contribute to the final data partition. Cluster ensemble selection (CES) aims at selecting a subset from a large library of clustering solutions to form a smaller cluster ensemble that performs as well as or better than the set of all available clustering solutions. In this paper, we investigate four CES methods for the categorization of structurally distinct organic compounds using high-dimensional IR and Raman spectroscopy data. The single quality selection (SQI) method is used with various quality indices to form subsets of the ensemble by selecting the highest-quality ensemble members. The Bagging method, usually applied in supervised learning, ranks ensemble members by calculating the normalized mutual information (NMI) between ensemble members and consensus solutions generated from a randomly sampled subset of the full ensemble. The hierarchical cluster and select method (HCAS-SQI) uses the diversity matrix of ensemble members to select a diverse set of ensemble members with the highest quality. Furthermore, a combining strategy can be used to combine subsets selected using multiple quality indices (HCAS-MQI) for the refinement of clustering solutions in the ensemble. The IR + Raman hybrid ensemble library is created by merging two complementary "views" of the organic compounds. This inherently more diverse library gives the best full ensemble consensus results. Overall, the Bagging method is recommended because it provides the most robust results that are better than or comparable to the full ensemble consensus solutions.
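
The NMI-based ranking mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`entropy`, `mutual_information`, `nmi`, `rank_by_nmi`) are hypothetical, the normalization by the geometric mean of the entropies is only one common choice, and the reference partition is simply passed in rather than built by a consensus function over a random sample of the ensemble.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two label assignments of the same objects."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI normalized by the geometric mean of the entropies (an assumption)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        # A one-cluster partition has zero entropy; treat two such
        # partitions as identical and any other pairing as unrelated.
        return 1.0 if h_a == h_b else 0.0
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


def rank_by_nmi(ensemble, reference):
    """Return member indices sorted from highest to lowest NMI with the reference."""
    scores = [nmi(member, reference) for member in ensemble]
    return sorted(range(len(ensemble)), key=lambda i: scores[i], reverse=True)
```

Because NMI depends only on how objects are grouped, relabeling the clusters of a partition does not change its score, which is why it is a convenient way to compare ensemble members with a consensus solution.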


Subjects
Algorithms , Machine Learning , Spectrum Analysis , Cluster Analysis
2.
J Chem Inf Model ; 62(24): 6316-6322, 2022 Dec 26.
Article in English | MEDLINE | ID: mdl-35946899

ABSTRACT

The Molecular Education and Research Consortium in Undergraduate Computational Chemistry (MERCURY) has supported a diverse group of faculty and students for over 20 years by providing computational resources as well as networking opportunities and professional support. The consortium comprises 38 faculty (42% women) at 34 different institutions, who have trained nearly 900 undergraduate students, more than two-thirds of whom identify as women and one-quarter identify as students of color. MERCURY provides a model for the support necessary for faculty to achieve professional advancement and career satisfaction. The range of experiences and expertise of the consortium members provides excellent networking opportunities that allow MERCURY faculty to support each other's teaching, research, and service needs, including generating meaningful scientific advancements and outcomes with undergraduate researchers as well as being leaders at the departmental, institutional, and national levels. While all MERCURY faculty benefit from these supports, the disproportionate number of women in the consortium, relative to their representation in computational sciences generally, produces a sizable impact on advancing women in the computational sciences. In this report, the women of MERCURY share how the consortium has benefited their careers and the careers of their students.


Subjects
Computational Chemistry , Students , Humans , Female , Male , Faculty , Research Personnel
4.
J Cheminform ; 14(1): 35, 2022 Jun 07.
Article in English | MEDLINE | ID: mdl-35672835

ABSTRACT

Facing the continuous emergence of new psychoactive substances (NPS) and their threat to public health, more effective methods for NPS prediction and identification are critical. In this study, the pharmacological affinity fingerprints (Ph-fp) of NPS compounds were predicted by Random Forest classification models using bioactivity data from the ChEMBL database. The binary Ph-fp is a vector consisting of a compound's activity against a list of molecular targets reported to be responsible for the pharmacological effects of NPS. The performance of Ph-fp in similarity searching and unsupervised clustering was assessed and compared to that of the 2D structure fingerprints Morgan and MACCS (1024-bit ECFP4 and the 166-bit SMARTS-based MACCS implementation in RDKit). The performance in retrieving compounds according to their pharmacological categorizations is influenced by the predicted active assay counts in Ph-fp and the choice of similarity metric. Overall, the comparative unsupervised clustering analysis suggests the use of a classification model with Morgan fingerprints as input for the construction of Ph-fp. This combination gives satisfactory clustering performance based on external and internal clustering validation indices.
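
Similarity searching over binary fingerprints such as those described in this abstract can be sketched as below. The abstract does not say which similarity metrics were compared, so the Tanimoto coefficient used here is an assumption, chosen because it is a common metric for binary fingerprints; the names `tanimoto` and `rank_library` and the toy fingerprints in the usage note are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two equal-length binary fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(fp_a, fp_b) if x and y)
    either = sum(1 for x, y in zip(fp_a, fp_b) if x or y)
    # Returning 0.0 for two all-zero fingerprints is a convention chosen
    # for this sketch, not something stated in the source.
    return both / either if either else 0.0


def rank_library(query, library):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

For example, `rank_library([1, 1, 0, 0], {"x": [1, 1, 0, 0], "y": [0, 0, 1, 1]})` places `"x"` first. A fingerprint with very few active bits leaves such a metric little to work with, which is one plausible reading of why the predicted active assay count matters for retrieval.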

5.
ACS Omega ; 6(47): 32151-32165, 2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34870036

ABSTRACT

The rapid emergence of novel psychoactive substances (NPS) poses new challenges and requirements for forensic testing/analysis techniques. This paper aims to explore the application of unsupervised clustering of NPS compounds' infrared spectra. Two statistical measures, Pearson and Spearman, were used to quantify the spectral similarity and to generate similarity matrices for hierarchical clustering. The correspondence of spectral similarity clustering trees to the commonly used structural/pharmacological categorization was evaluated and compared to the clustering generated using 2D/3D molecular fingerprints. Hybrid model feature selections were applied using different filter-based feature ranking algorithms developed for unsupervised clustering tasks. Since Spearman tends to overestimate the spectral similarity based on the overall pattern of the full spectrum, the clustering result shows the highest degree of improvement from having the nondiscriminative features removed. The loading plots of the first two principal components of the optimal feature subsets confirmed that the most important vibrational bands contributing to the clustering of NPS compounds were selected using non-negative discriminative feature selection (NDFS) algorithms.
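
The two similarity measures named in this abstract can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: spectra are assumed to be equal-length lists of intensities on a shared wavenumber grid, all function names are hypothetical, and the conversion to distances as 1 - r is only one common choice for feeding a hierarchical clustering routine.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def similarity_matrix(spectra, measure):
    """Pairwise similarity matrix for a list of spectra."""
    return [[measure(a, b) for b in spectra] for a in spectra]


def distance_matrix(similarity):
    """Convert correlations in [-1, 1] to distances 1 - r (an assumption)."""
    return [[1.0 - s for s in row] for row in similarity]
```

With the toy inputs `[1, 2, 3, 4]` and `[1, 4, 9, 16]`, `spearman` gives 1.0 while `pearson` gives about 0.984: any monotonic relationship gets a perfect rank correlation, which is consistent with the abstract's remark that Spearman tends to overestimate similarity based on the overall pattern.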

6.
J Chem Theory Comput ; 12(8): 3571-82, 2016 Aug 09.
Article in English | MEDLINE | ID: mdl-27294314

ABSTRACT

The myriad conformers of the neutral form of the natural amino acid serine (Ser) have been investigated by systematic computations with reliable electronic wave function methods. A total of 85 unique conformers were located using the MP2/cc-pVTZ level of theory. The 12 lowest-energy conformers of serine fall within an 8 kJ mol⁻¹ window, and for these species, geometric structures, precise relative energies, equilibrium and vibrationally averaged rotational constants, anharmonic vibrational frequencies, infrared intensities, quartic and sextic centrifugal distortion constants, dipole moments, and ¹⁴N nuclear quadrupole coupling constants were computed. The relative energies were refined through composite focal-point analyses employing basis sets as large as aug-cc-pV5Z and correlation treatments through CCSD(T). The rotational constants for seven conformers measured by Fourier-transform microwave spectroscopy are in good agreement with the vibrationally averaged rotational constants computed in this study. Our anharmonic vibrational frequencies are compared to the large number of experimental vibrational absorptions attributable to at least six conformers.


Subjects
Gases/chemistry , Serine/chemistry , Hydrogen Bonding , Molecular Conformation , Infrared Spectrophotometry , Thermodynamics